 conference sheraton cavalier saskatoon hotel


Classification of Safety Events at Nuclear Sites using Large Language Models

arXiv.org Artificial Intelligence

A Station Condition Record (SCR) that is assessed as relevant to safety receives extra scrutiny to maintain personnel safety at the nuclear station. SCR classification is currently manual: human evaluators examine multiple SCRs every week. These records, which may be submitted by any employee, cover a broad spectrum of events and undergo management review to determine an appropriate response. If an SCR is deemed relevant to safety, it undergoes further examination by the Health and Safety department and is documented in a specialized database. The SCR database covers a range of occurrences, from equipment malfunctions and delays in material delivery to staff missing training sessions, so it is cumbersome for the Health and Safety department to sift through every SCR to identify safety-related items before transferring the pertinent details into its safety tracking system. The aim of this project is to develop a machine learning classifier that automatically differentiates between safety-related and non-safety-related SCRs. The tool is not intended to supplant human assessment; it serves as an additional layer of scrutiny and speeds the review of safety-related SCRs by triggering a pipeline that copies all relevant data into the safety system for final human verification.
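The triage flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `SCR` record, the `SAFETY_TERMS` list, the `keyword_classifier` stand-in (a placeholder for the learned classifier), and the `triage` function are all hypothetical names, and the sample records are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SCR:
    """Hypothetical record; the real SCR schema is not given in the source."""
    scr_id: str
    text: str


# Assumed keyword list, for illustration only.
SAFETY_TERMS = ("injury", "fall", "radiation", "fire", "leak")


def keyword_classifier(scr: SCR) -> bool:
    """Placeholder for the learned classifier: True if the SCR looks safety-related."""
    lowered = scr.text.lower()
    return any(term in lowered for term in SAFETY_TERMS)


def triage(scrs: Iterable[SCR], is_safety_related: Callable[[SCR], bool]) -> List[SCR]:
    """Return the SCRs flagged for final human verification."""
    return [scr for scr in scrs if is_safety_related(scr)]


# Invented sample records covering the event types named in the abstract.
records = [
    SCR("A-1", "Worker reported a fall from a ladder in the turbine hall."),
    SCR("A-2", "Material delivery delayed by two days."),
    SCR("A-3", "Staff member missed scheduled training session."),
]

review_queue = triage(records, keyword_classifier)
```

Passing the classifier in as a callable keeps the routing step independent of the model, so the keyword placeholder could be swapped for any classifier that returns a boolean, while the human verification step stays last.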


Evaluating ChatGPT on Nuclear Domain-Specific Data

arXiv.org Artificial Intelligence

This paper examines the application of ChatGPT, a large language model (LLM), for question-and-answer (Q&A) tasks in the highly specialized field of nuclear data. The primary focus is on evaluating ChatGPT's performance on a curated test dataset, comparing the outcomes of a standalone LLM with those generated through a Retrieval Augmented Generation (RAG) approach. LLMs, despite their recent advancements, are prone to generating incorrect or 'hallucinated' information, which is a significant limitation in applications requiring high accuracy and reliability. This study explores the potential of utilizing RAG in LLMs, a method that integrates external knowledge bases and sophisticated retrieval techniques to enhance the accuracy and relevance of generated outputs. In this context, the paper evaluates ChatGPT's ability to answer domain-specific questions, employing two methodologies: A) direct response from the LLM, and B) response from the LLM within a RAG framework. The effectiveness of these methods is assessed through a dual mechanism of human and LLM evaluation, scoring the responses for correctness and other metrics. The findings underscore the improvement in performance when incorporating a RAG pipeline in an LLM, particularly in generating more accurate and contextually appropriate responses for nuclear domain-specific queries. Additionally, the paper highlights alternative approaches to further refine and improve the quality of answers in such specialized domains.
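The difference between methodologies A and B can be illustrated with a small sketch. It assumes a toy word-overlap retriever and a two-passage knowledge base chosen for the example; the paper's actual retrieval technique, prompts, and data are not given here, and the function names (`retrieve`, `direct_prompt`, `rag_prompt`) are hypothetical. No LLM is called; the sketch only shows how the prompt sent to the model differs between the two setups.

```python
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    """Lowercase bag of words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query: str, passage: str) -> int:
    """Number of word occurrences shared by the query and the passage."""
    return sum((tokenize(query) & tokenize(passage)).values())


def retrieve(query: str, passages: List[str], k: int = 1) -> List[str]:
    """Toy lexical retriever: the k passages with the highest word overlap."""
    return sorted(passages, key=lambda p: overlap(query, p), reverse=True)[:k]


def direct_prompt(question: str) -> str:
    """Methodology A: the question is sent to the LLM on its own."""
    return f"Question: {question}\nAnswer:"


def rag_prompt(question: str, passages: List[str], k: int = 1) -> str:
    """Methodology B: retrieved context is prepended to the question."""
    context = "\n".join(retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Illustrative knowledge base, not the paper's dataset.
knowledge_base = [
    "Heavy water is used as a moderator in CANDU reactors.",
    "Control rods absorb neutrons to regulate the fission rate.",
]
question = "What moderator is used in CANDU reactors?"

prompt_a = direct_prompt(question)
prompt_b = rag_prompt(question, knowledge_base)
```

In setup B the model sees the retrieved passage alongside the question, which is the mechanism the abstract credits for more accurate, contextually appropriate answers. A production pipeline would typically replace the word-overlap scorer with a stronger retriever.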